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Abstract 

In this paper we introduce a method for determining local interaction rules 
in animal swaims. The method is based on the assumption that the behavior 
of individuals in a swarm can be treated as a set of mechanistic rules. 

The principal idea behind the technique is to vary parameters that define a 
set of hypothetical interactions to minimize the deviation between the forces 
estimated from observed animal trajectories and the forces resulting from the 
assumed rule set. We demonstrate the method by reconstructing the interac- 
tion rules from the trajectories produced by a computer simulation. 

Key words: swarming, behavioral rules, reverse engineering, force match- 
ing. 
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The collective motion of living organisms, as manifested by flocking birds, 
schooling fish, or swarming insects, presents a captivating phenomenon believed 
to emerge mainly from local interactions between individual group members. In 
part, the study of swarming and flocking aims to understand how animals use vi- 
sual, audial and other cues to orient themselves with respect to the swarm of which 
they are part, and how the properties of the swarm as a whole depend on the be- 
haviors of the individual animals. Also when addressing evolutionary questions of 
behaviour in swarms and flocks, such as the selective advantage of being bold or 
shy in response to a predator, it is important to understand how the individuals be- 
have based on the relation to their neighbours in the swarm or flock. For example, 
if the question is "If the peripheral of the flock is more exposed to predators, do 
some individuals cheat the others by staying at the center of the flock where they 
are more protected?", knowing the effective rules would make it easier to address 
questions regarding the evolutionary stability of the altruistic behavior. 

Because flocks cannot be understood by studying individuals in isolation, and 
are difficult to conduct controlled experiments on, understanding the behavioural 
patterns underlying flocking and swarming is especially challenging. Consequently, 
collective behavior has been extensively modeled particularly using the agent-based 
modeling framework, where simple mechanistic behav i oral rules are used to g ener- 
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43 force to avoid collisions with obstacles or other animals; a force adjusting the veloc- 

44 ity to fit nearby individuals' velocities; and a force for avoiding being alone, e.g. by 

45 moving towards the aver age position of the nearby individuals. However, see e.g. 



Romanczuk et al.l (|2009[) for an alternative formulation. In addition, drag forces 

47 and noise are used to model the medium through which the individuals move, and 

48 external forces can be used to model interactions with terrain or predators. 

49 The main strength of the agent-based modeling framework is the relative ease by 

50 which swarming behavior emerges from local interactions. This is however also its 

51 Achilles heel: Alternative mechanisms can generate visually similar swarming pat- 

52 terns. To reveal the effective interactions among swarming individuals of a specific 

53 species, several studies have introduced static quantitative observables, such as the 

54 distribution of inter-individual distances, swarm density, polarity, sharply defined 
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59 These techniques can provide valuable insights into aspects of what type of 

60 interaction a group of animals use, but in general cannot be used to reveal both the 

61 type and the strength of the interactions. We suggest here how this can be made 

62 possible by reverse-engineering the interactions directly from observed trajectories. 

63 For this purpose, we adapt the force-matching (FM) technique originally introduced 

64 to obt ain simplified force fields in complex molecular simulations ( Ercolessi and 



65 Adams, 



1994 



Noid et al. 



2008alJbl) . This technique minimizes the mean squared 



66 difference between observed total forces (estimated from the trajectories) and a set 
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67 of force hypotheses building on knowledge about the system under consideration. 

68 For more details, see the "Materials and Methods" section. 

69 The FM method relies on dynamical data, and to this date no such data of large 

70 swarms exists publicly available. While the technical challenges of obtaining tra- 

71 jectory data are demanding, there is currently an increased effort in collecting large 



Ballerini et al. 



72 scale da ta sets, as demonstrated by the STARFLAG project (see e.g. 

73 (|2008ah ). where flocks consisting of thousands of starlings above Rome were pho- 

74 tographed, mapping the coordinates of the individual starlings. In the absence of 

75 field data, we use simulations of swarm models to produce underlying data. This 

76 way of testing the FM method also serves as a necessary proof of principle of the 

77 proposed framework. 

78 As a demonstration problem, we fo cus on distingu i shing b etween two compet- 



2008ah : In the first altema- 



79 ing hypotheses for animal interactions (|Ballerini et al.L 

80 tive, called the 'geometrical' hypothesis, swarm individuals base their movement 

81 decisions on the relative positions and velocities of neighboring individuals inside 

82 a sphere with fixed radius. The alternative hypothesis, called 'topological', is that 

83 individuals use a fixed number of closest neighbors i n the flock to perform th e same 



Ballerini et al. 



(l2008ah . where 



84 task. The demonstration is motivated by the results in[ 

85 a different method was used to infer that the interactions among flocking starlings 

86 are topological. In that study it was also argued that each bird is interacting with its 

87 6 or 7 nearest neighbors, but the detailed nature of the interactions could not be in- 

88 ferred. The ideal would be to use reconstructed trajectories of individual starlings, 

89 but this is unfortunately not available at present (Ballerini, personal communica- 

90 tion). In lieu of this data we apply our method to simulated flocking dynamic under 
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91 the geometrical and topological scenarios. Our analysis proceeds in two steps. First, 

92 we show that both the geometric and the topological scenario are very difficult to 

93 tell apart by fitting simulations of the geometric scenario using topological forces. 

94 Second, we demonstrate how including all forces in the fitting process solves this 

95 problem, and allows one to assess the relative power of either scenario to explain 

96 the varition in the forces observed in the sampled trajectories. 



9. MATERIALS AND METHODS 

9B Generation of trajectory data for force matching 

99 The trajectory data used in testing the FM methodology are generated from com- 

100 puter simulations of swarm models. Two different scenarios are set up and sim- 

101 ulated: One in which each individual interact through geometric interactions, the 

102 other in which the individuals follow topological interaction rules. The geometrical 

103 scenario is modeled using the following equations of motion: 

d^r- r 1 

£ finj) tij + ai {\j - \i I nj < r,.) + a2{rj - r,- 1 nj < r,.) - yv,- + /3C,■(^)■ 



J^2 



(1) 



106 Here r, is the position of individual /, is the distance between individuals i and 

107 J, Vjj is the normalized direction vector from i to j, f{r) is the collision avoidance 

108 force, «! and Ui define the strength of the velocity matching respective positional 

109 preference forces, 7 is the drag force relative to the ambient medium and (^{t) is 
no a noise vector. The geometrical scenario is manifested in the averages, e.g. (r/ — 
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111 r, I rjj < fc), which involve all neighboring individuals within a cut-off radius r^. 

112 The noise vector is necessary in combination with the drag force to set the average 

113 speed of the individuals. Each component of the noise vector is an independent 

114 random variable, uncorrelated between individuals and in time. 

115 In the topological scenario, the equations of motion are nearly identical, 

d^r- r 1 

116 ^ = £ f{rij)rij + a3{\j-\i\nij<N) + a4{rj-ri\nij<N) - yv/ + /3^,(0, 



(2) 



iiB but the averages now involve the N closest neighbors, and 054, similar to a\ and 

119 a2, define the strength of the velocity matching respective positional preference 

120 forces. For simplicity, we take the collision avoidance force in both the geometrical 

121 and topological scenarios to be a linearly decreasing function: 

f{r) = -CD{l-r/R,) (3) 

123 when r < Rc, and is zero outside this range. The simulations are run with 200 

124 individuals for 15000 unit time steps in a cube with side length L = 50 and periodic 

125 boundary conditions. The parameters are set to: (0 = 0.1, i?c = 5, ai = a2 = oc^ = 

126 ^4 = 0.1, 7=0.1, /3 =0.1,rc = 4, andA^ = 7. 

127 The force matching method 

12B For a particle system where the trajectories are sampled with a time resolution ade- 

129 quate to decide the acceleration of each particle, the FM method is a useful tool to 
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investigate the structure of the effective interactions. As stated in the introduction, 
the FM method is based on minimizing the mean squared difference between the 
observed force on a particle and the force resulting from a set of force hypothe- 
ses. Mathematically, by representing the observed total force on particle i at time t 
by the force vector F/(f) and the corresponding force predicted by hypothesis h by 
F''(?), this amounts to finding a minimum of the expression 



where the average runs over local configurations in both space and time. The force 
at time t, F,(?), is estimated using a finite difference approximation of the accelera- 
tion from three consecutive time steps: 



Setting up the force hypotheses generally requires knowledge about the system 
that is examined. Contrary to molecular particle systems, it is obviously impossible 
to explain the interactions between animals in terms of the fundamental theories 
of physics. This implies that the FM method applied to collective animal systems 
should focus not only on setting the parameters right for a given choice of inter- 
actions rules, but also to find and distinguish between biologically plausible force 
hypotheses. 

We show how the FM method can be used to assess the capability of competing 
mechanistic models to explain observed motion, by applying the method to the 
problem of distinguishing whether individuals in a swarm follows geometrical or 
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151 topological interaction rules (IBallerini et all I2008af) . and to provide an estimate of 



152 the actual interaction parameters. First, suppose that repulsive interactions occur 

153 over a range < r < Rc . Partitioning this interval into A'^i equal bins, any smooth 

154 function can be approximated with a constant value within each bin. Thus, the 

155 estimated repulsive force can be written as /(r) = Y!kL\'^kh{^)^ where is an 

156 indicator function which is one when r belongs to bin k and is zero otherwise. 

157 Second, to distinguish between the two scenarios for a simulated swarm, we let 
15B the hypothesized forces F''^ include both the geometrical and topological scenarios. 
159 Using [Equations (l")]and|(2)|this leads to the following statistical model for the force 



on particle 

j^i k=l k=l k=l 



N2 Ni 



162 + 12 4 (v; - V/ 1 riij <Nk) + Y, ^^k (v j - v,- 1 nj < r,.^k) 



163 



k=l k=l 



fv„ (6) 



164 where parameters to / are unknown parameters to be estimated. They corre- 

165 spond to, respectively, collision avoidance (ak), moving to average position using 

166 topological hypothesis (bk), moving to average position using geometrical hypothe- 

167 sis (ck), aligning velocity using topological hypothesis (dt), aligning velocity using 

168 geometrical hypothesis (ek), and the dissipative force (/). Because the stochastic 

169 forces are uncorrelated in time and between individuals, they affect only the vari- 

170 ance and not the mean value in the minimization process, and need therefore not be 

171 included. This is of course true for the simulated swarm, but might not hold for real 

172 systems. 

173 Note that multiple hypotheses are set up for both the geometrical and topolog- 
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174 ical scenarios. As not only the strengths of the interactions are unknown, but also 

175 the interaction range, this motivates the inclusion of several hypotheses, spanning 

176 a wider interaction range. The cut-off values r^k are A^3 values equally spaced be- 

177 tween a minimum and maximum hypothetical cutoff radius (in this paper, 3 and 

178 5, respectively). Note that the value of Vc actually used in the simulations of the 

179 geometric scenario (i.e. Vc = 4) falls within this range, so that, if successful, the 

180 method is expected to find the correct parameter. In the topological scenario, we 

181 take Nk = k for k = 1, . . . , 10. When applying the FM method to trajectory data 



182 from simulated swarms, with individuals following respectively Equation (1) (for 



183 the geometrical hypothesis) and [Equation (2)| (for the topological hypothesis), we 

184 use the values Ni = N2 = 10, and A^3 = 12. We use the same value of 5t in the FM 

185 procedure as in the actual simulations. 



186 The minimization problem Equation (4) can in general be solved iteratively, 

187 using e.g. the Newton-Raphson method. However, when F'^(?) is linear in its pa- 

188 rameters, as is the case for Equation (6)[ the problem reduces to a linear least square 

189 problem min^ \ \Ax — y\\^, where x is the parameter vector and y is the observed ac- 

190 celerations. The minimization problem is equivalent to solving an overdetermined 

191 linear equation system Ax = y , where A i s an nxm matrix, n ^ m, and is solved by 



192 X = (A^ A) ^A^y, see (|Press et al 



1996 ) for details. From the solution of the FM 



193 method, one can determine the relative importance of the hypothesized forces by 

194 the magnitude of the corresponding parameters, so that the force that best correlates 

195 with the observed trajectories is favored over competing hypotheses. 
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RESULTS 

197 Using trajectory data from simulations of swarming particles, following geometri- 



es cal and topological interaction rules [Equations (1) and (2) in Materials and Meth- 



199 ods, respectively], we test the adapted FM method's performance for reconstructing 

200 known interactions. The results are summarized in |FiguresT| and [21 Black dots and 



201 intervals show averages and standard deviations, respectively, from eight indepen- 

202 dent simulations. The figures are divided into sections for the different parameter 

203 groups, separated by dashed lines, showing the estimated coefficients of the differ- 

204 ent forces. From left to right in each panel: Section a shows the estimated collision 

205 avoidance force as a function of distance r/y between the animals. Sections b and d 

206 shows the coffecients of the forces 'moving to average position' and 'velocity align- 

207 ment' under the topological hypothesis, respectively, as a function of the maximum 

208 number of interacting neighbours (A'^). Sections c and e are analogous to sections 

209 b and d, respectively, but shows the coefficients under the geometric hypothesis, as 

210 a function of the cutoff-radius rc. Finally, section f shows the strength of the dissi- 

211 pative force. The forces corresponding to the true scenario simulated is shown as a 

212 solid line in section a, and by solid circles in panel a-f. The circles show the values 

213 of the coefficients used in the simulations (all other coefficients are zero). 



214 Figure 1 shows how the method performs when simulating one scenario but 

215 fitting it to the other scenario. In panel A the particles move according to the geo- 

216 metrical scenario, but only the topological forces are fitted. Conversely, in panel B 

217 the particles move according to the topological scenario, but only the geometrical 

218 forces are fitted. In both cases, the coefficients of the fitted forces are significantly 

219 different from zero, and if fitted independently one would therefore conclude that 

10 



the wrong hypothesis is supported by the data. 



221 Figure 2 shows the same simulations as Figure 1 , but with all coefficions fit- 

222 ted simultaneously. Under both the geometric and topological scenario, the method 

223 identifies the forces used in the simulations and correctly estimates their magnitude. 

224 Together, these results demonstrate the importance of fitting all parameters simulta- 

225 neously. In general, if the scope of the range of forces tested against is too narrow, 

226 one may be mislead into accepting a false hypothesis because parameters can be 

227 found that generate a reasonable fit. However, when more than one hypothesis is 

228 included the magnitudes of the fitted forces show how much of the variation in the 



229 trajectories can be attributed to the hypothesis the forces correspond to: In Figure 2 

230 it is apparent from the relative magnitude of the coefficients which hypothesis is 

231 favoured over the other. 

232 Despite appearances the scattering of coefficients with low magnitude in section 



233 c of both panels in Figure 2 is not random; not only are the coefficients significantly 

234 different from zero but take very similar values in both panel A and panel B. This 

235 structure is an effect of the repulsive force being equal in both cases. 

236 DISCUSSION 

237 Our findings can be summarized in two main points: First, we suggest to adapt and 

238 apply the force matching method from multi-scale molecular dynamics to test and 

239 calibrate hypotheses about the local rules governing collective motion in swarms 

240 and schools. Second, based on swarm simulations, we demonstrate that when using 

241 the FM method on observed data, it is essential to include all candidate hypotheses 



11 



242 simultaneously in the regression. Because several different mechanisms often give 

243 rise to similar behavior, the candidate hypotheses can give speciously good results 

244 when fitted independently, and it is therefore difficult to say which is the more 

245 likely candidate. When fitted simultaneously, however, the strengths of the different 

246 forces provide a measure of the explanatory power of the particular hypothesis in 

247 relation to the competing hypotheses. 

248 The force matching is efficient if all individuals can be classified, with individ- 

249 uals in a class following identical rules (in this paper we have only considered one 

250 class). In practice a classification could be based on e.g. morphology, such as size. 

251 However, if the rule variation is caused by traits that are harder to distinguish, e.g. 

252 genetic variation, it poses a much larger challenge to the method. Another possi- 

253 ble problem is that the interaction rules could be non-deterministic. However, this 

254 is not a fundamental problem, since it should be relatively straight forward to ex- 

255 pand the framework to include stochastic interaction rules. This is currently work 

256 in progress. 

257 The need for high temporal resolution may seem as a severe restriction on the 

258 FM method. In practice however, a mitigating factor is that trajectories need only 

259 be piecewise continuous and must not necessarily include all individuals in the 

260 swarm. For each data point we need only three images closely spaced in time 



261 [c.f. Equation (5) |, from which accelerations and velocities can be calculated in ad- 

262 dition to the distances between the animals. Thus, if the setup allows for collecting 

263 images in bursts, one will obtain piecewise estimates of the quantities necessary to 

264 apply the force matching. As long as the rules underlying the observed trajectories 

265 do not change, it does not matter that there are gaps in the observed trajectories - 
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although the data can be used more effectively if continuous trajectories should be 
available. 

The method presented in this paper should be seen as a theoretical basis for 
how to infer interaction rules in animal swarms but it could also possibly find gen- 
eral applicability in a wide variety of systems often modeled u sing mechanisti c 



272 traffic (|Helbingl . 



rules. Examples other than swarming includ e human crowdi ng (iHelbing , 



2000) . 



20021). While the method 



19981) . and possibly finance (iLeBaronl 
has been shown to work for simulated swarms, it has not yet been applied to field 
data, and therefore it is currently not possible to conclude how well the method 
handles the natural variability found in biological systems. Collecting dynamical 
data is a technical challenge, and at present such data is not available (at least not 
publicly). Several efforts in this direction is however underway and we are currently 
involved in testing the proposed method on trajectories of schooling fish. 
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FIGURE LEGENDS 



Figure 1 



351 Force parameters for two models estimated from trajectories using the FM method 



352 with only parts of the force hypotheses in Equation (6) In A, the FM method us 



353 ing only topological hypothesis applied to data from simulation with geometrical 



354 interactions [Equation (1) with rc = 4]. In B, vice versa [topological interactions 



according to Equation (2) with A'^ = 7]. See the text for the remaining parameters. 
The results were obtained from averaging over 8 simulations run with 200 individ- 
uals for 15000 time steps in a cube with side length L = 50 and periodic boundary 
conditions. The different sections labelled a, b, c, d, e, and f correspond to the pa- 



359 rameters Uk to / in Equation (6) Within each section, k increases from left to right. 



Each letter represents a behavioral rule, and the subscripts corresponds to force hy- 
potheses: collision avoidance (a), moving to average position using topological rule 
(b), moving to average position using geometrical rule (c), aligning velocity using 
topological rule (d), aligning velocity using geometrical rule (e), and dissipative 
force (f). The parameters estimated with the FM method are plotted as black mark- 
ers (•) with error bars (± one standard deviation). In section a, the repulsive force 



used in the simulations, see Equation (3) is plotted as a solid line. In the remaining 
sections, the exact parameter values used in the simulations are plotted as empty 
circles (o). Only the non-zero values are shown. 



Figure 2 



Estimating all force parameters simultaneously, from simulations of the geometric 
scenario (panel A) and the topological scenario (panel B). The results were obtained 
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373 from averaging over 8 simulations run with 200 individuals for 15000 time steps in 



374 a cube with side length L = 50 and periodic boundary conditions. See |Figure2| for 

375 explanation of symbols, an the text for parameter values. 
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